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Abstract 

Ph : 

• We study fixation probabilities and times as a consequence of neutral genetic drift in subdivided populations, motivated by a model 
m y of the cultural evolutionary process of language change that is described by the same mathematics as the biological process. We 
O focus on the growth of fixation times with the number of subpopulations, and variation of fixation probabilities and times with 
1 initial distributions of mutants. A general formula for the fixation probability for arbitrary initial condition is derived by extending 
i , a duality relation between forwards- and backwards-time properties of the model from a panmictic to a subdivided population. 
From this we obtain new formulae formally exact in the limit of extremely weak migration, for the mean fixation time from an 
arbitrary initial condition for Wright's island model, presenting two cases as examples. For more general models of population 
^ subdivision, formulas are introduced for an arbitrary number of mutants that are randomly located, and a single mutant whose 
position is known. These formulae contain parameters that typically have to be obtained numerically, a procedure we follow for two 
C*") contrasting clustered models. These data suggest that variation of fixation time with the initial condition is slight, but depends 
^— ' strongly on the nature of subdivision. In particular, we demonstrate conditions under which the fixation time remains finite even 



o 



in the limit of an infinite number of demes. In many cases — except this last where fixation in a finite time is seen — the time to 
fixation is shown to be in precise agreement with predictions from formulae for the asymptotic effective population size. 
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1. Introduction 

Genetic drift is a generic term for fluctuations in allele 
frequencies that arise by sampling a finite population to 
produce offspring in the next generation. In the absence of 
mutation, these fluctuations can lead to extinction of some 
alleles, ultimately causing one allele to fix. One can there- 
fore ask whether it is feasible for some trait to propagate 
across an entire population by neutral genetic drift alone. 
In the s imple st mathematical mod els, such as th ose due to 
Fisherl (|l930f) . IWrightJ (|l93lh and iMoranl (|l958h . individ- 



uals from the entire population mate randomly and it is 
known that fixation time (measured in units of the expected 
lifetime of one individual in the population) increases lin- 
early with the populati on size |Kimura and OhtaL Il969t 



Crow and Kimural . Il970f l . This growth law calls into ques 



tion the viability of neutral genetic drift as a mechanism 
for population-level change, on the grounds that changes 
in large populations are simply far too slow. An important 
question then arising is whether non-random mating — for 
example, that seen in subdivided populations — can reduce 



fixation times in large populations. 

This very same question has recently arisen in the con- 
text of the cultural evolutionary phenomenon of language 
change, in which the unit of variation is some aspect of 
spoken language, such as a vowel sound, through which 
one can distinguish different dialects of the same language, 
but which are otherwise functionally equivalent. A simple, 
agent-based model o f language reception and reproduction 



(Ba xter et al. . 2006f) . has a mathematical description that 



coincides with that of neutral genetic drift in a subdivided 
population, although a number of details of the underlying 
evolutionary processes are rather different — in particular, 
the language does not evolve due to genetic changes in the 
speakers, but rather the frequencies of linguistic variants in 
the population of utterances change over time in a manner 
akin to genetic drift. The key points here, however, are that 
fixation of an allele corresponds to a linguistic innovation 
being adopted as a community's convention, and that inter- 
actions between speakers map to migrations between large 
subpopulations. Thus here even a linear growth in fixation 
time with the number of speakers (each one correspond- 
ing to a single subpopulation) is untenable: a change that 



becomes established in a small clique in a few days would 
then require several hundred years to propagate across a 
modestly-sized society. To see why this is a problem, one 
should note that this long times cale roughly corresponds 
with the age of language itself (|Christiansen and Kirbv . 
2003), whilst society-wide language change has been seen 



to occur within one or two human gen erations, for example 



in the case of new dialect formation (| Gordon et al 
TrudgiliEool . 



2004 



It is therefore tempting to suggest that some form of 
selection, i.e., intrinsic fitness of some linguistic variants 
over others, is required for an innovation to spread rapidly 
across an entire society. To be able to state this categori- 
cally, however, we must understand quantitatively how fix- 
ation times and probabilities depend on the network of in- 
teractions between speakers in the society (which, in the 
biological interpretation of our model, corresponds to a set 
of migration rates between geographically separated sub- 
populations) and the overall size of that network (the num- 
ber of subpopulations). We also need to explore how these 
fixation statistics vary with changes in the initial condition 
(distribution of mutants across the total population) since 
this is the information provided in the historical data that 
we hope to use to infer the propagation mechanisms of lin- 
guistic innovations. In this work, we aim to develop this 
required understanding. 

A recurring theme in the considerable literature on ge- 
netic dr i ft in s ubdivided populations — this beginning with 
IWrightl ( 1931) and continuing through to the pres ent day 
(see e.g Charlesworth et al. , 20031 : Wakelev . 20051 for re- 
cent reviews) — is that properties of fixation are not strongly 
affected by such subdivision. For example, when migra- 
tion is conservative, the probability a mutant allele fixes, 
whether selectively advantageous or not, is the same as that 
as in an ideal panm i ctic population with the same over- 
all size ( Maruvamal . Il970bl ISlatkirl 1981 ). In particular 
this means that no matter where mutants are initially po- 
sitioned, they fix with equal probability. For this state of 
affairs to change, one either has either has to introduce ad- 
ditional proces ses — such as extinction and recolonisation 
(|BartonLll993l) — or relax th e assu mption of conservative 
migration ( Lieber man et al ., 2005) in which case certain 
geographical structures can result in an allele with even a 
small selective advantage fixing with near certainty. Whilst 
the dependence of fixation probability on the initial mutant 
distribution can be established rather easily (see below), 
the corresponding variation of fixation time seems harder 
to obtain and progress in this direction forms the bulk of 
the present work. 

A continued study of the literature reveals that the con- 
ce pt of effective population size — which again goes back 
to lWrighd il93l|)— has proved useful in characterising the 
overall fixation timescale. Essentially, the idea is that many 
properties of neutral genetic drift in a subdivided popu- 
lation comprising in total N instances of a gen^3 are the 



same as those for an ideal panmictic population with an 
effective number of gene instances N e . 

One prominent definition of effective population size re- 
lates to the change per generation in the varian ce of a 
mutant allele frequency (jCrow and KimuraL 1197(1 . Let x 
be the frequency of mutants present in a population at 
some time, and x' their frequency after a single generation. 
Then, the variance effective size can be defined as N e = 
x(l — x)/Var(a;'). If this effective population size is large, 
one can then use a diffusion approximation to model the 
evolution of the distribution of mutant frequencies. One 
can then calculate the mean time to fixation to be 



-2N, 



(1 - x) ln(l - x) 



(1) 



as wa s shown in the classic work by iKimura and Ohtal 
(1969). Thus if one has the variance effective population 
size for the kind of large, subdivided population that is 
of interest in the present work, Eq. (1) can be used to 
estimate the fixation time. In principle such an effective 
population size could be extracted from the diffusion ap- 
proximation that applies to genetic drift in a subdivided 
population. However, one typically finds that the effec- 
tive sixe_derjejids_on the_mutm x (see 
e.g. Roze and Rousset . 20031 : iRousseti 120031 ) which makes 
the resulting expressions difficult to handle, unless ad- 
ditional approximations ar e made such as that used by 
Cherry and Wakelevl (|2003l ). Instead, we exploit here the 
fact that equilibrium values of genetic variables, such 
as the frequency of mutants, are all approached expo- 
nentially with a common time const a nt as ymptotically 
( Whitlock and Bartonl Il997t IRousseti 12004 . This time 
constant can then be used to define an effective population 
size. 

One way to do this is to consider the change per gener- 
ation in the probability that two individuals sampled from 
the populatio n are identical by descent, i.e., share a com - 
mon ancestor ([Whitlock and Bartonlll997l : lRoussetJ . l2004r ). 
Let us recapitulate here the deriv ation of the for mula for 
asymptotic effective size given by IRousseti (|2004l ) for the 
model of population subdivision that will be used through- 
out this work. This model comprises L demes, each having 
a fixed subpopulation size Ni . After a generation of repro- 
duction and migration, the probability that an individual 
in deme i is an offspring of a parent in deme j is , hence 
J2j Hij = 1 for all i. Given the set of probabilities Fij that 
two individuals, one sampled from deme i, one from deme 
j, share a common ancestor, one finds one generation later 
that the corresponding set of probabilities are given by 



F' 
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fa. 



(2) 



1 It is traditional in the population genetics literature to talk in 



terms of a number of diploid organisms, in which case the number of 
genes N is twice the number of organisms. Since ploidy is irrelevant in 
the cultural evolutionary application, we shall suppress these factors 
of two. 



This form arises because if the parents of the two sampled 
individuals came from different demes k and £, they are 
identical by descent with probability F^i, whereas if the 
parents are from the same deme k they have a probability 
1/iVfc of being the same individual (and hence genetically 
identical in the absence of mutation) whereas otherwise the 
probability they are identical is (1 — l/Nk)Fkk- 

To simplify this expression, we introduce the important 
quantity Q* which is defined as follows. Consider some sub- 
set of the present-day population, and trace its ancestry 
backwards in time until a single common ancestor is found. 
Now, if one keeps tracing backwards in time, this ancestor 
will hop from deme to deme and eventually the probabil- 
ity that it is to be found in deme i approaches the station- 
ary value Q*. Since a hop from deme i to j occurs with a 
probability /x^ per generation, these stationary probabili- 
ties satisfy the equation 



(3) 



Multiplying both sides of (2) by Q*Q* and summing over 
all i and j, one then finds that 

L 
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(4) 



where F = J^ij QtQjFij- As t — > oo, the probability that 
any pair of individuals share a common ancestor approaches 
unity under neutral evolution. As this limit of infinite time 
is approached, one can unambiguously define an asymptotic 
effective size of the subdivided population through 
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lim 



AF 



lim 



(5) 



where t is the number of generations that have elapsed since 
imposing some initial condition. One way to interpret this 
equation is as an average of coalescence rates 1 /N of two 
lineages present in the same deme i in the limit t — > oo g iven 
that they have not previously coalesced ( Roussetl 12004 ). 

If migration is a fast process, one can envisage that the 
identity probability Fij is roughly the same for all pairs of 
demes i and j. Under such conditions, the formula simplifies 
to 

1 _^(Q*) 2 



N K 



N, 



(6) 



This formula w as obtained by an independent means by 
Nagvlakil(jl980h . and was shown to be valid in the limit of 
infinite deme sizes when migration rates per generation sat- 
isfy limjVj— >oo /J>ijN = oo, an e xpres sion that defines a fast 
migration limit (see Wakelevl . [2005, for a fuller discussion 
of the various lim its that are of interest) . For intermediate 
migration rates. IWhitlock and Barton! ( 1997 ) presented a 
formula in which the denominator of (5) is replaced by 1 — F 
where F is the identity probability averaged uniformly over 
all pairs of individuals in the population. 

In this work we are almost exclusively interested in the 
slow migration limit, where deme sizes again again ap- 



proach infinity, but one has limjVj— >oo IHjNi < oo. There 
are several reasons for this. Firstly, this is the limit that 
naturally a rose in the model of language change previously 
described ( Baxter et al. . 2006). In a population genetics 
context this limit is also relevant when the subpopulation 
sizes are large and one is interested in the evolution over a 
number of generations that is of the order of these subpop- 
ulation sizes. Finally, it is a limit in which any deviation 
from panmictic behaviour — such as sensitivity to an ini- 
tial distribution of mutants — would be most likely to arise, 
and thus wort h exploring fro m a more theoretical point of 



view (see, e.g.. lSlatkinl . Il98ll for a discussion of the differ- 



ent phenomenology in the fast and slow migration limits). 
We remark that this is not the only slow migration limit 
that can be taken: one can also take fcj ~ /j, — ► at fixed 
population size s and measure time in units of 1/fi (see, e.g., 



Wakelevl . l2004f ). For clarity, we reiterate that throughout 



most of this work, we deal with a slow migration limit in 
which deme population sizes N are infinite, and migration 
rates are inversely proportional to this size. Furthermore, 
for reasons to be discussed in the next section, we will most 
often be dealing with the extreme case where the coeffi- 
cients are vanishingly small. 

In this limit, it is unclear whether one can simply use the 
expression (5) for the effective size that appears in Eq. (1) 
for the fixation time. Tracing the ancestry of a state of fixa- 
tion backwards in time, the extreme slow migration regime 
could lead to a situation in which the majority of the rele- 
vant coalesence events occur long before the time at which 
the mean coalescence rate given by the right-hand size of (5) 
is reached. We also remark that it has also been argued that 
only when lineages are able to equilibrate through a rapid 
migration process between coalescent events, is it appropri- 
ate to characteri se the coalescence rates by a single effective 
popu lation size ( Nordborg and Kronel 2002 ; Siodin et al 



This fact is particularly relevant when considering a 
final state of fixation, since one must trace the ancestry of 
the entire population. Finally, whether or not the effective 
population size turns out to provide good estimates of fixa- 
tion times in the slow migration limit, it does not provide a 
framework for understanding how fixation times vary with 
the initial distribution of mutants. 

To assess the utility of the asymptotic effective popula- 
tion size in characterising fixation timescales, and to inves- 
tigate the effect of varying the initial condition, we develop 
a formalism within which the spatial distribution of mu- 
tants as a function of time can (in principle, at least) be cal- 
culated exactly given any initial distribution and model of 
a subdivided population undergoing neutral genetic drift. 
Specialising to the case of a final state of fixation, we can ex- 
tract via Eq. (1) a measure of effective population size and 
compare with its asymptotic counterpart using Eq. (5) . As 
has been hinted above, the approach we use is based o n the 
backward time coalescent process ([Kingmanl Il982f ) that 
has found many applications i n the context of drift in sub- 



divided populations (see, e.g. , iNotoharal Il990t [Takahata 
ll99lllNei and TakahataLll993tlDonnellv and Tavarel ll995 



Q 



Wak elev . 19981 : Wilkinson-Herbotsl . 1998 ; Notohara , 2001 ; tant fixing with near certainty. The distinction is that in 



Charlcsworth et al. 



Wakelevl . 120011 

20061) . The main benefit of this approach is that lineages 



that do not contribute to the final state of fixation are 
explicitly excluded from the mathematical description. 
However, a fixed initial condition does not enter naturally 
in this description, and so some work is needed to match 
this up with the distribution of ancestors of the present day 
state of fixation. As we have already remarked, population 
subdivision is harder to handle in diffusion-equation-based 
approaches, although the initial condition is more easily 
enforced and extensions to alleles with a selective ad- 
vantage can be treated more straightforwardly (see, e.g. , 
Maruyamal Il970bl; ISlatkinl . Jl98ll ; iBartonl Il993t IWhitlockl . 
200 21 ICherrv and WakelevmOOa [Role and RoussetJ . 1200a 
Wakelev and Takahashil . 120041 ). 



In the next section we will present the details of the 
derivation of the exact formula relating forward- and 
backward-time properties to one another, and demonstrate 
the simplifications that occur in the limit of extemely slow 
migration. The rest of this work is then devoted to conse- 
quences of this formula under different models of subdivi- 
sion. It is worth at this early stage to fix the basic ideas 
underlying the approach by considering the simple case of 
fixation probability as a function of initial condition. Con- 
sider therefore the probability that a mutant allele fixes 
when initially a fraction Xi of the individuals in deme i 
are mutants. We denote this probability P*(A), where at 
this stage A is a shorthand for the initial condition (we 
will define it more formally below). If we look infinitely far 
forward in time from this initial state, fixation of either the 
mutant or the wild type is guaranteed to have occurred. 
Then rewinding infinitely far from the final state back to 
the initial state, we find a single ancestor in deme i with 
probability Q* as previously discussed. Since the initial 
assignment of mutants to individuals in the population is 
completely independent of the ensuing population dynam- 
ics, this ancestor is a mutant with probability Xi- Hence, 
the fixation probability is simply 



P*(A) = 



L 

£: 



Thus to find the fixation probability from any initial condi- 
tion, it is enough to know the distribution Q* which is the 
stationary distribution of the Markovian migration process 
running backwards in time. Since finding this stationary 
distribution is a fairly standard problem in the theory of 
stochastic processes, we shall not consider it in further de- 
tail here, other than to remark that through variation in 
the migration rates, it is possible to construct models in 
which Q* for some deme i is close to one. If selectively neu- 
tral mutants initially completely occupy that deme they 
are then almost guaranteed to spread to the whole popula- 
tio n. This is similar i n spir it to the phenomenon discussed 
by iLieberman et "ah (2005), in which it was seen that the 
spatial structure of the population could give rise to a mu- 



20031 : Wakelev and Lessardl that work, the mutant was considered to have a selective 

advantage and thus that the initial location played only a 
minor role in the probability of fixation. Under neutral ge- 
netic drift, the situation is reversed: the same mutation can 
fix or go extinct with near certainty depending on where 
it occurs. This will be demonstrated explicitly in Section 5 
below. 

Before this, in Section 3 we derive from the exact for- 
mulae presented in Section 2 an expression that holds for 
arbitrary population subdivision and gives the mean fixa- 
tion time from an initial condition in which mutants are 
randomly distributed with an overall frequency x. This 
would be the appropriate initial condition to use, for ex- 
ample, when modelling a historical situation for which ini- 
tial mutant frequencies, but not their location, are known. 
We learn that the resulting expression involves the mean 
times to particular coalescence events from the final state 
of fixation, which need then to be obtained either analyt- 
ically or by Monte Carlo sampling. One cas e in which the 
former is possible is Wright's is land model (jWrig b3. ll93lt 
Maruvamalll970al Latter . [l973h ; furthermore, the formula 
can be extended to an arbitrary initial condition. There- 
fore, in Section 4 we are able to perform a number of exact 
calculations for this very well-studied model, apparently 
for the first time. For example, we can consider in addi- 
tion to the initial condition that has mutants initially ran- 
domly scattered across all islands with some overall mutant 
frequency x the contrasting case where the same number 
of mutants are all initially confined to the smallest possi- 
ble number of islands. In genetics terms, these two differ- 
ent initial conditions correspond to the two extreme values 
of the inbreeding coefficient Fst = and 1 respectively. 
For the island model, we show that the difference in fix- 
ation time for these two initial conditions is short on the 
timescale of fixation which grows linearly with the num- 
ber of demes. We further provide evidence that fixation 
from any non-random initial condition is slower than from 
a random initial condition. For more general models, ex- 
act calculations are much more difficult. Nevertheless, we 
introduce a formula for fixation time from a single muta- 
tion with known location (a case of practical interest) to 
the statistics of the most recent common ancestor of the 
whole population. This will be explained in Section 5. By 
combining analytical results with numerical data from sim- 
ulations we explore fixation times, and their variation with 
initial condition, in two concrete and contrasting models 
in which there demes fall into clusters within which migra- 
tion occurs at different rates than between demes from dif- 
ferent clusters. Our interest here is in how fixation times 
grow with increasing number of demes, either by adding 
more demes to each cluster, or by adding clusters of fixed 
size. Whilst the former case has been considered in a num- 



(7) 



ber of works (such aslWakeleyl.l2001tlWakelev and Aliacar . 
20011: IWakelev and Lessardl . 12006). we obtain in this work 
some evidence that in the latter case, the mean time to fix- 
ation can remain finite in the limit of an infinite number 



A 



of demes. This effect — and the general growth law for the 
fixation time in a subdivided population — is predicted by 
the effective size formula (5) in the slow migration limit. In 
all cases other than that in which fixation in a finite time 
is seen, the coefficient of the growth law is also accurately 
predicted by (5). We discuss possible origins for the dis- 
crepancy that remains, along with implications of our find- 
ings for more plausible models of social interaction and mi- 
gration between biological populations in the conclusion, 
Section 6. 



2. Fixation probability as a function of time 

We begin by deriving an expression for the distribution 
of mutants that is valid for any model of population subdi- 
vision and arbitrary initial condition. We do however stip- 
ulate that under the population dynamics that an individ- 
ual's chances of leaving offspring in the next generation are 
unaffected by whether it is a mutant or not (i.e., there is 
no selection), and that subpopulation sizes and migration 
rates are constant in time. 

As stated in the introduction, we consider a population 
that is subdivided into L demes with iVj individuals in deme 
i. The initial condition, denoted A, is specified by the num- 
ber of mutants di in each deme, i — 1, 2, . . . , L. Two dis- 
tinct distributions are central to this work. The first is the 
probability P(B\A; t) that after t generations of reproduc- 
tion within and migration between demes, all descendants 
of A form the set B, i.e., a distribution with precisely bi 
mutants in deme i. The second is the probability Q(C\D; t) 
that all ancestors of a present-day sample D formed t gen- 
erations previously the set C . The quantities di and Cj are 
defined analogously to a, and bi for these sets. 

These two distribut ions are conn ected by a duality rela- 
tion that was given bv lMohlel ( 2001 ) for the panmictic case, 
L = 1. The generalisation to L > I is found by recalling 
that — in the absence of selection — the assignment of mu- 
tant alleles to individuals within the population is a pro- 
cess independent of the population dynamics. We proceed 
by constructing equivalent expressions — one for P and one 
for Q — for the probability of the event that a set of individ- 
uals A contains all ancestors of some set of individuals D 
present in the population t generations later. To be clear, 
the set A must contain at least the ancestors of individu- 
als in D, and may additionally contain ancestors of indi- 
viduals outside D, or individuals that have no descendants 
at the later time; likewise, descendants of A may form a 
superset of the individuals in D. Therefore, we obtain the 
desired probability by summing P(B\A;t) over all sets of 
individuals B that contain D; meanwhile Q(C\D;t) must 
be summed over all sets C that are contained within A. In 
both cases, the sum must be weighted by the probability 
that one randomly chosen set is contained within the other. 
The expression that results is 
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In this work, we will find Q(C\D; t), or quantities derived 
from it, either analytically or numerically, and use this du- 
ality relation to derive new formulae relating to fixation. 
This requires us to to rearrange the implicit expression (8) 
so that the forward-time probability P (that relates to fix- 
ation statistics) is given explicitly in terms of Q. This is 
achieved by making use of the identity 



j 



(-1) 



i+j . 



E(-d 



k 



(9) 



in which 5i t k is the Kronecker delta symbol. Here, the first 
step is achieved by expanding the binomial coefficients in 
terms of factorials, making some cancellations and recom- 
bining remaining factorials as binomial coefficients. Then, 
if k < i, the sum is zero by definition; if k > z, the alter- 
nating bin omial coefficients sum to zero (see e.g., formula 
(0.15.4) in lGradshtevn and Rvzhik[|2000[) ; if k = i one can 
easily verify that that the resulting expression equals unity, 
as required. Multiplying both sides of (8) by 



di 



and summing over all bi and di one finds 
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Q(C\D;t) (10) 



after dropping the prime on the variables bi that remain. 
For a final state of fixation, bi = Ni and this expression 
simplifies to 



(ii) 



in which we have introduced the notation P(A; t) for the 
probability of fixation from an initial distribution of mu- 
tants A, and Q(C; t) for the distribution of ancestors of the 
entire population a time t previously. Although we are con- 
cerned only with fixation in this work, we remark that (10) 
has wider applicability and could form the basis of further 
studies of genetic drift in a subdivided population. We also 
note — although we have no use for it here — that it is possi- 
ble to write down a similar explicit formula for Q in terms 
of P by making use of the identity (9). 

In the remainder of this work we specialise to the slow 
migration limit for the reas ons outlined in the introduc- 
tion. In particular we know (jNagvlakil . |1980| ) that if /Jij is 
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the probability that an individual sampled at random from 
deme i had a parent from the previous generation in deme 
j, the overall population behaves like a panmictic popula- 
tion with an effective size given by Eq. (6) if one has 

lim Nfiij = oo (12) 

N^oc 

where N is the mean deme size. Since we are most interested 
in those cases where deviation from panmixia is most likely, 
and genetic diversity is maintained for the longest periods 
of time before the onset of fixation, we shall insist that 



lim Nfiij — TUij < oo 



N- 



(13) 



holds for all i and j. In this limit it is convenient to work 
in rescaled time units, so that that one unit of time corre- 
sponds to N generations of the underlying population dy- 
namics. These rescaled time units will be in force until the 
end of this work. 

In these rescaled time units, the backwards-time hopping 
of lineages becomes in the limit N — » oo a continuous- time 
process in which a hop from deme i to j occurs with rate 
m,j. Meanwhile, pairs of lineages collocated in deme i co- 
alesce at a rate N/Ni which is assumed to remain finite in 
the limit N — ► oo. The fact that both of these processes oc- 
cur on the same timescale makes calculation of Q(C; t) in 
general difficult, since coalescence and migration events are 
intermingled. To simplify the calculations, we shall further 
assume that all the migration rates mij are proportional to 
some vanishingly small parameter m. This defines an ex- 
treme slow migration limit that we shall henceforth assume 
is in operation. Then, we expect that the probability that 
a migration event occurs in the time it takes all lineages 
that are present in a single deme to coalesce vanishes with 
m, even in the limit of an infinite system. This expecta- 
tion is based on the fact that the fixation time in an ideal 
population is of order one (in the rescaled time units), one 
can choose m sufficiently small that the probability that 
all lineages coalesce before any of them migrate elsewhere 
with arbitrarily high probability. We note that for large, 
but finit e, populations th is assumption has been used previ- 
ously bv lTakahat al (Il99ll) and formal ly proved bv lNotohara 
(120011) : furthermore, [GriffithsJ (Il984^ gives grounds to be- 
lieve that in an infinite population, only a finite number of 
lineages remain at any nonzero (rescaled) time. Given that 
this is the case, the probability for there to be more than 
one ancestor in any deme at a time of order 1/m vanishes, 
and hence all the sums over Ci in (11) can be truncated at 
Cj = 1. This yields for the fixation probability as a function 
of time the expression 



P(A-t) = H^x, Ci Q(ci,.,ci;t) 



(14) 



= 1 c i= 



which is the fundamental equation that will be used repeat- 
edly in this work. Here, Xi = a i/^i is the initial fraction 
of the individuals in deme i that are mutants and by con- 
vention we take Xi = 1 whenever Xi = 0. Any timescales 
calculated from this expression will thus be proportional to 



1/m, and it is to be understood that any times T quoted 
in the following are believed to have the property that the 
combination mT is exact in the limit of extremely slow mi- 
gration, m — ► 0. 

We remark that the formula for the ultimate probability 
of fixation, Eq. (7) , then follows by taking the limit t — > oo 
(which is indicated by the asterisks on P and Q). In this 
limit Q(ci, . . . , Cl) t) is zero for any state with more than 
one ancestor; and converges to Q* for the state with a single 
lineage in deme i. Since lineages hop from deme i to deme 
j going backward in time, this distribution is given by the 
solution of the balance equations (3) recast in terms of the 
parameter m^, viz, 



E Q*i mi - 
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1,2, 



(15) 



It is a ssumed — as is customary ( Nagylaki , 1980t Whitlock and Barton , 
1997) — that the stationary solution is unique. One way 
that this can be assured is if it is possible for every deme to 
be reached from any other through some sequence of mi- 
gration events, as is well known from the theory of Markov 
chains (see e.g.. lGrimmett and Stirzakerl 120011 ). 



3. Fixation from a random initial condition 

Our first application of the formula (14) concerns a ran- 
dom initial condition A x in which a fraction x of all individ- 
uals are mutants, but are spatially distributed uniformly 
at random. The probability of having oi, <i2) • • • mutants in 
demes 1, 2, ... is then given by 

L 
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/ LN_ N 
\xLNi 
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N, 



The probability of fixation by time t from this random con- 
dition is obtained by summing the right-hand side of (14) 
over all initial conditions A that have \A\ = Ej «i = xLN, 
weighted by the previous expression. In this sum one en- 
counters the combination 



A i=l 



'\A\,xLN ■ 



This can be evaluated by noting that it is the coefficient of 



cLN 
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»=i 



in the power series expansion of 
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{l + zy l ) N ^=\{z c ^l + zy l ) 



(16) 
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evaluated at y% = y% = ■ ■ ■ = ijl = 1. This reveals the 
fixation probability P r (x; t) from a random initial condition 
to be 



P(A x ;t)=l[J2 

i=i c i= o 



xLN- 



( LN -) 

\xLNJ 



-Q( Cl ,...,c L ;t) . (17) 



Note now that the combinatorial factor appearing in the 
sum depends only on the total number n — E c j of ances- 



tors of the entire population that remain after going back- 
wards a time t from the present day, and not their location. 
Since, for any finite n, we have 



/ LN-n \ 
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(18) 



we find in the limit of infinite subpopulation sizes (but fixed, 
finite deme number L) the simple expression 



P(A x ;t) 
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x n Q(n;t) 



(19) 



for the fixation probability, in which Q(n; t) is the probabil- 
ity that the entire population had precisely n ancestors at 
a time t prior to the present day. Comparison with Eq. (14) 
reveals that this random initial condition gives the same 
fixation probability as a homogeneous initial condition, in 
which every deme contains a fraction \i = o-i/Ni — x mu- 
tants. We also see that since lim t _ 00 Q(n; t) = 5 Uy i, the ul- 
timate fixation probability P*{A X ) = x: that is, no matter 
what the migration rates are, the fixation probability for 
a randomly distributed set of mutants is always equal to 
their initial overall frequency. 

Since the probability that fixation occurs in the time 
interval [t, t + dt] is -^P(A; t)dt, the mean time to fixation, 
averaged over those realisations where fixation does occur, 
is 



t(A) = 



P 



For the case of the random initial condition, 



t(A x ) 



T,n=x^^tf t Q{n;t)dt 



(20) 



(21) 



Noting that in any realisation of the dynamics, the n- 
ancestor state is entered or exited exactly once (except for 
the n — 1 state which is never exited), this expression can 
also be written as 

L 



T(A x )=T 1 + J2x n ~\T n -T n - 1 ) 



(22) 



in which T n is the mean time at which the state with n 
ancestors is entered, going backwards in time from a state 
with one lineage per deme (i.e., the whole population in 
the extreme slow migration limit). We reiterate that this 
expression, believed not to have appeared before, is valid 
for any set of migration rates (at least, in the extreme slow 
migration limit) and further remark that when the coales- 
cence times T n cannot be obtained analytically, they can be 
easily estimated by Monte Carlo sampling of the genealo- 
gies. 

4. Fixation in Wright's island model 

A model for which fixation times can be calculated ana- 
lytically for any initial condition is Wright's is land model 
((Wrightl . Il93ll : iMaruvamal . Il970at iLatteri . 1 19731) . This pro- 
vides one case where variation of fixation time with initial 



condition can be fully explored, and comparison made with 
the result for a random initial condition via the result of 
the previous Section. These new results provide more de- 
tailed information about the island model in the slow mi- 
gration limit than the formulae f or the mean and variance 
of coalescence times provided bv lTakahata ( 1991 ). 

In this model of population subdivision, migration occurs 
at a uniform rate between every pair of denies, and all 
dcmes have the same size 2Vj = N. In order that results for 
different numbers of demes can be compared, we insist that 
mean number of individuals replaced in each generation is 
independent of the number of demes L. We therefore set 
rriij = m/(L — 1) in which m defines an overall timescale 
as observed in Section 2. 

Analysis of the model is relatively straightforward be- 
cause the statistics of the genealogies are invariant under 
relabelling of the demes. Therefore, the distribution of an- 
cestors Q(C;t) depends only on the number of lineages n 
that are contained within C and not their location. The 
fixation probability (14) can thus be written as 

L 

P(A;t) = Y / r n (xi,---,XL)Q(n;t) (23) 

n=l 

where Q(n; t) is as defined in the previous section and the 
coefficients T n {xi, ■ ■ ■ ,Xl) are proportional to the elemen- 
tary symmetric polynomials in \'- 
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(24) 

The ultimate fixation probability P*(A) — Fi(xi, ■ • • ,Xl)i 
as can be seen by comparing with Eq. (7) . Using this in the 
denominator of (20), we find the mean time to fixation is 

L 



t(A) = 71 + 



r ra (xi, • ■ • iXl) 
^ r!( X i,...,XL) 



(25) 



Note the similarity with Eq. (22) given in the previous Sec- 
tion for the case of a random initial condition. Here, an ex- 
pression in terms of entry times T n is possible only because 
all states with n ancestors are equiprobable at a given time 
in the history of the dynamics. 

The entry times themselves can be calculated easily be- 
cause, as in an ideal population, the rate of coalescence from 
a state with n lineages to one with n— 1 lineages is a Poisson 
process that occurs with a rate proportional to n(n — 1). In 
the slow migration li mit, the constan t of proportionality is 
m/(L — 1) (see, e.g.. lTakahatal . ll991l ). whereas in an ideal 



population with n <C N, it is i. The mean time spent in 
the n ancestor state is then simply 

L - 1 



T n -i — T n — 



(26) 



mn(n — 1) 

As the initial condition comprises n — L lineages, we have 
that Tl = 0, and therefore 

T i = E ( T «-i - T «) = ^-t) ■ (27) 
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The mean time to fixation from a random initial condi- 
tion A x is then obtained by using these expressions in (22). 
We find 



t(A x ) 



L - 1 



L - 1 



1 - 



L J ^ \n n-l 



n=2 



L — X-n T 
, . V — A X X Jb 



71=1 



71 



(28) 
(29) 



When the number of islands L is large, we can expand the 
term in square brackets as a series in 1/L to obtain 



t(A x ) 
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(l-i) ln(l -x) 
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(30) 

which can be compared to (1), taking into account that in 
this expression time is measured in units of N generations, 
whereas (1) measures time in generations alone. 

Since, as previously described, the random initial con- 
dition is equivalent to a homogeneous initial condition 
where a fraction x of the individuals in each deme are 
mutants, the most interesting comparison is with a highly 
inhomogeneous initial condition A x that has the same 
number of mutants. This is attained by having a fraction 
x of the demes containing only mutants, and the rest 
containing only the wild type. For such a distribution, 

r„(xi, ...,7a) = { L n)/0- Wc thcn find that thc mean 

time to fixation is 



t(A x ) 
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(31) 

To simplify this sum, it is convenient to replace the quantity 
in thc final set of round brackets with (— — i) — (—^-r — ~). 
Collecting terms, one eventually finds after some rearrange- 
ment that 
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(32) 
(33) 



in which the second line was obtained from the first by 
expanding out the binomial coefficients and further rear- 
rangement. For large L, this expression can be written as 



r(A x ) 
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(1 - x) ln(l - x) 
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(34) 

We see that both initial conditions yield in the limit L —* 
oo the same functional form as the classic result (1) for an 
ideal population, up to a change of timescale. Taking into 
account that (1) gives the fixation time in terms of a num- 
ber of generations, we find the effective population size of 
a single panmictic population to be N e = N(L — l)/2m; in 
what follows a useful measure will be the effective popula- 
tion size a e = N e /N relative to the mean size of a single 



deme, since this remains finite in the limit N —> oo. This 
correspondence between fixation times can be understood 
by the fact that, as described above, the coalescence pro- 
cess at the level of demes in the slow migration limit is 
precisely the same as that for an ideal population, albeit 
on a longer timescale. It is perhaps interesting to observe 
that whilst (1) was obtained using forward-time diffusion 
equations, these new results have been determined entirely 
within the backward-time coalescent formalism thus pro- 
viding an explicit demonstration of the equivalence of these 
two complementary approaches. 

For large, but finite L, the difference between the fixa- 
tion time for the inhomogeneous and homogeneous initial 
conditions (A x and A x respectively) is 

t(A x ) - t(A x ) ~ -L + OiL- 1 ) . (35) 
2m 

As this difference is small on the timescale of fixation, which 
grows linearly with the number of demes L, we suggest 
that an inhomogeneous initial condition relaxes quickly to 
a homogeneous state. This state, which has a frequency 
x of mutants in each deme, would then persist for a time 
proportional to L. This accords with what one observes in 
a Monte Carlo simulation of t he forward-time population 
" 20061) . 
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We conclude this Section by arguing that the homoge- 
neous (or random) initial condition gives the shortest fixa- 
tion time of all possible distributions that have an overall 
fraction x of mutants in the population. To show this, one 
needs first to impose the constraint ^ i \i — Lx by set- 
ting xl — xL — X^jLi 1 Xi- Then, one finds the extremum 
of t(A) by setting all the derivatives of the right-hand 
side of (25) with respect to the independent parameters 
Xi) X2, • • • jXl-i to zero. It is a straightforward, but te- 
dious, exercise to show that -^-T n (xi, ■ ■ ■ ,Xl) is propor- 
tional to xl — Xi with a non-negative constant of propor- 
tionality, except for the case n = 1 where thc derivative 
vanishes due to the constraint. All derivatives of (25) thus 
vanish when all the fractions Xi are equal, which corre- 
sponds to the homogeneous initial condition. Although this 
demonstrates only that this is an extremal fixation time, 
the fact that the fixation occurs more slowly from an inho- 
mogeneous initial condition (at least for large L) is sugges- 
tive that the extremum is a minimum, particularly since 
the analysis just outlined also shows that the point Xi = x 
(for all i) is the only extremum in the interior of the (L — 1)- 
dimcnsional space of independent parameters Xi- 

5. Fixation from the first mutation event 

For models of migration that have more structure than 
Wright's island model, calculation of the fixation time from 
an arbitrary initial condition is much more difficult. We 
thus specialise to the comparatively simple case of an initial 
condition that has a single mutant in the entire population. 
Our approach is to assume that certain statistical proper- 
ties of the most recent common ancestor (MRCA) of the 



whole population are known, e.g., from an explicit calcu- 
lation or Monte Carlo sampling. Then, starting from (20) 
we derive a new formula for the fixation time from a single 
mutation as a function of its location. We first present this 
derivation, and then illustrate its implications through two 
explicit and contrasting models of population subdivision. 

5.1. Relation between MRCA statistics and fixation time 

If we have an initial condition A{ in which a fraction \ 
of the individuals in deme i are mutants, and all others in 
the population are the wild type, we have, for arbitrary x, 
the mean fixation time 



Ti = T{Ai) = 



(36) 



in which Qi(t) is the probability that, a time t prior to the 
present day, the entire population has a single ancestor that 
is located in deme i. To make a connection with the MRCA, 
we introduce three quantities: first, the probability density 
Tj (t) for the single ancestor state to be entered in deme j at 
time t; second, the integral of this quantity R* — / °° rj (t)dt 
which gives the total probability that the MRCA is in deme 
j; and finally Qji(At) for the single ancestor of the whole 
population to be in deme i a time At after it was in deme 
j going backwards in time. With these definitions, we then 
have that 

= E f te'rj{t')Qii{t - t') • (37) 
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The numerator of (36) is then 
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The double integral in this expression can be written as 
J dt? j dt(t' + t)r j (t')-Q ji (j;t) = 



/ dt't' rj (t')[Q* - QjiiO)} + R* 
Jo Jo 



tj t Q 3i {t)dt . 



(39) 

Two simplifications now occur: first, the term containing 
Qji(0) cancels that in (38); second, we have that 
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(t) dt = T 1 , 



(40) 



since the total probability density for the MRCA to be 
found at time t is the sum ^ . rj(t). The expression (38) 
consequently reduces to 

f°° d L f°° d 

y t-Qi(t)dt = Q*Ti + ^ R* ^ t-Q ri {t)dt . (41) 



To attack the remaining integral we diagonalise the LxL 
matrix M that has elements 



Ma = 




(42) 



and is the generator of the Markov process the describes 
the backward-time hopping in the single-ancestor state. 
This matrix has L eigenvalues, one of which Ai is zero 
and corresponds to the stationary state: the left and right 
eigenvectors have elements [ui]j = Q* and [vi]j = 1 re- 
spectively. The remaining eigenvalues A2 , . . . , Xl all have a 
strictly negative real part, since the stationary state is by 
assumption unique. The left and right eigenvectors satisfy 
the biorthogonality relation u„ • v m = S n ^ m . 

The probability density Qji (t) can then be written as 

L 

Quit) - [c Mt ]^ = QJ + E e An *[u„] i [v„] j (43) 

n=2 

in which the stationary solution has been separated out for 
clarity. We can now evaluate the integral in (41), 

r4 f Q d t)dt = ±^k, (44) 
Jo dt n=2 A n 

which allows us finally to write down an expression for the 
fixation time Tj originally given by (36). It reads 
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(H*-v„) 



(45) 



where R* is the (row) vector of probabilities for the location 
of the MRCA. Thus the problem of calculating the mean 
time to fixation from the first mutation event in deme i 
is reduced to that of determining the mean time to the 
MRCA, Ti, its spatial distribution R* and the eigenvalues 
and eigenvectors of the LxL matrix of migration rates M . 

Before applying this formula to concrete models, we re- 
mark that should the MRCA distribution R* and station- 
ary distribution Q* coincide, = T\ for all i. This is be- 
cause then R* is the zero left eigenvector of M, Uo, and the 
scalar product R* ■ v„ vanishes by the biorthogonality of 
the eigenvectors. One situation where this occurs is when 
any pair of demes can be exchanged without affecting the 
dynamics, as in Wright's island model: then both R* and 
Q* are uniform. 

In general, the difference between Tj and T\ will be non- 
zero. Furthermore, by summing (45) weighted by Q*, or 
taking the limit x — > in Eq. (22), the mean time to fixation 
from a randomly located mutation is equal to T\. Hence, Tj 
can be larger or smaller than Tl , a result which at first sight 
seems counterintuitive, but can be understood from the fact 
that one averages over all realisations of the dynamics to 
find the mean time to the MRCA, but only over a restricted 
subset in the case of fixation. 



n 



5.2. Two example applications 

We now determine fixation times in two concrete models 
of population subdivision undergoing extremely slow mi- 
gration. The first is similar to Wright's island model, in 
that migration is permitted between every pair of demes. 
However, migration occurs at one of two rates, depending 
on whether a pair of demes are considered to belong to the 
same, or different, clusters of demes. The second model has 
a further restriction, in that migration between clusters can 
only occur if one of those clusters is a special central cluster. 

We will compare the data obtained with predictions for 
the fixation time from (1) in the limit x — ► and using 
the asympto tic effective siz e given by Eq. (5). As has been 
established ( Slatkin , Il99lh the limiting ratios of identity 
probabilities appearing in (5) can be replaced by ratios of 
mean coalescence times Yy for two individuals, one sampled 
from cluster i and one from j. Specifically, one obtains 



N, 



yL 



wSQ* 



l Y 

J- 1.1, 



(46) 



In the foregoing we have assumed that, on the timescale 
of migration, the rate of coalescence between pairs of lin- 
eages located in a single deme is infinitely fast. However, 
we clearly cannot simply set Yu = in the previous expres- 
sion to obtain an estimate of the fixation time; instead, we 
must take into account that coalescence occurs at a fast, 
but finite rate. 

To this end, let use return to the original time units 
where there is a finite mean subpopulation size N, and 
the population evolves in discrete time. We will take the 
size of subpopulation i to be Ni = a,iV, and migration 
probabilities fiij — (mi/y)/iV. We recall that the extreme 
slow migration limit (and a continuous-time dynamics) is 
reprised by first taking N — > oo and subsequently m — > 0. 
The mean coalescence times are then given by the solution 
of the set of linear equations 
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(47) 



We anticipate that the leading contribution to the co- 
alescence times Yu between pairs of lineages in the same 
deme grows as N in the limit N — > oo and m — > 0, whilst 
Yij grows as N/m in this limit. We can thus write the exact 
solutions to (47) as power series in the (small) parameters 
1/N and m: 



Y U = N{ Y<f ] + 0{m) + 0(1/ N) 
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MP + O (m) + O ( 1 / N) ) for i ^ j 



(48) 
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Substituting these expansions into (46), and taking the 
limit N — > oo followed by m — > yields 



a e = lim lim 

m->0jV->oo N 



for the effective size of the entire population a e relative to 
the mean subpopulation size N. Thus we need only deter- 
mine the leading coeffecients in the series expansions. From 
(47) we find linear equations satisfied by these coefficients 
by again taking the limit N — > oo followed by m — ► 0. For 
the case i ^ j, we have 
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(51) 



which are precisely the expressions obtained if one approx- 
imates the coalescence process as one that occurs at an in- 
finite rate. For the case i = j, however, one finds the finite 
result 



Y^ = a t 
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(52) 



Substituting this expression into (50) we find the formula 

0e = ^SiSEl (63) 

for the relative effective population size. Note that this ex- 
pression, valid in the slow migration limit, depends only on 
fixation times between pairs of lineages in different demes, 
and the deme sizes on do not enter. Finally, using (1) in the 
limit x — > 0, and returning to rescaled time units (i.e., those 
that have one unit of time corresponding to N generations 
of the population dynamics), we arrive at an estimate of 
the fixation time r e that behaves as 

T e (54) 

m 

in the limit m — > 0, with a e given by (53) in terms of 
the solutions to (51). It is with this estimate that we shall 
compare data for the two different concrete models. 

5.2.1. Two-level model 

Our first concrete model has I equal-sized clusters, ni = 
n. Every deme receives a fraction y of its migrants from 
demes within its own cluster, and the remaining fraction 
1 — y from demes in other clusters. So that this fraction 
is well defined, we will insist that n > 2 in all cases. We 
will also have the total rate of migration into deme i, rtii = 
rriij equal to the small parameter m in each deme. These 
considerations specify the parameters i/y appearing in the 
migration rates my = mi/ij as 
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i,j from same cluster 
i,j from different clusters 



(55) 
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We remark that this hierarchical versio n of the island model 
has p reviously been considered by ISlatkin and Voelni 
( 19911 ). It has the special property that the dynamics are 
unaffected by exchanging any pair of demes, and so the 
distribution R* of the MRCA is uniform; since Yljjti m ij = 
(50) Sj^i m ji we a l so nave from Eq. (15) a uniform station- 
ary distribution Q*. Therefore Tj does not depend on i, 
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Fig. 1. Comparison of numerical calculations of the time to the 
MRCA T\ in the two-level model with L = 84 demes and two 
different cluster sizes, n = 7 and 12. The curved lines show data 
from a numerically exact solution, and the points the corresponding 
averages over 10 5 genealogies generated by Monte Carlo sampling. 
The sampling error on the latter are smaller than the symbol size. 

and is thus equal to T\. Note also that the special value 
y = (n — l)/(L— 1) corresponds exactly to Wright's island 
model, and for larger values of y one has faster migration 
between demes within the same cluster than between those 
in different clusters. 

We plot in Fig. 1 the combination mXi/2 as a function of 
y obtained using two different numerical methods: this com- 
bination then gives an empirical definition of a e through 
Eq. (54). The approach is to solve the linear equations (47) 
extended to states comprising more than two lineages (th e 
details of this method are given by, e.g., Notohara . l200ll ); 
the second method is Monte Carlo sampling of the ances- 
tries. In turns out that the former approach is computa- 
tionally tractable only up to L w 80, and so the latter is 
preferred for larger system sizes. At the smaller values of L 
where both approaches are possible, we see that the data 
are — up to numerical errors — indistinguishable. Although 
not particularly evident from the figure, it turns out that 
Ti, and hence the fixation time, is always shortest for that 
value of y that corresponds to Wright's island model. One 
also sees a divergence in T\ as y — > 1, since then inter- 
cluster migration is prohibited. 

We now investigate the growth of fixation time with L 
in the regime where intra-cluster migration is (at least for 
sufficiently large L) faster than inter-cluster migration, and 
compare the effective number of demes so obtained with the 
predictions of (53). Of the many possible ways in which the 
limit L — > oo can be taken, two are of particular interest to 
us. In the first, the number of clusters £ is held fixed whilst 
the number of demes in each cluster n goes to infinity; the 
the second, the cluster size n is held fixed as more and more 
clusters are added. 

Since the steady state is uniform, Qf = we find from 
Eq. (53) that 



(n - l)Y s + n(£ - l)Y d 
1 + 2yY s + 2(1 - y)Y d 



Fig. 2. Mean time to the MRCA (and hence to fixation from a single 
mutant) normalised by the prediction for the effective number of 
demes for the two-level model as a function of system size with fixed 
cluster number i = 12 and for y = 0.4,0.6,0.8. The points show 
numerical data, with errors of the same order as symbol size. The 
solid lines are empirical fits mT/2a e = A + BL In all cases A fa 1. 

where Y s = for a pair of demes i,j belonging to the 
same cluster, and Y d the corresponding quantity for two 
demes from different clusters. The equations (51) become 
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Substituting the solutions into (56) and taking the limits 
of interest one finds that 
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(n -[!-»])(!-!,) 



oo, fixed n 



(56) 



(59) 

That is, in both cases, the relative effective population size 
a e grows linearly with L asymptotically. This behaviour is 
seen in the numerical data shown in Fig. 2 (for constant 
£) and Fig. 3 (for constant n) in which the combination 
mT 1 /2a e is plotted, with the predictions for a e given by the 
above formula?. Also shown in the figure are fits to the data 
mTi/2a e = A + BL f , a value of A w 1 then indicating a 
correct prediction for a e . We see that in for both cases of 
fixed cluster number (£ — 12) and fixed cluster size (n = 
12), the asymptotes for three values of the parameter y = 
0.4,0.6 and 0.8 are all consistent with a value of 1, thus 
demonstrating the accuracy of the predictions of (53) in 
these instances. 

5.2.2. Hub- and- spoke model 

In the second example model, we also divide the demes 
into £ clusters of equal size n > 2, but this time one of 
the clusters is a special hub deme with migration between 
clusters permitted only if one of the two clusters is the 
hub. The remaining clusters thus form spokes — see Fig. 4. 
This model is intended to reproduce an effect noticed in so- 
cial networks, in which some members of a society interact 
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Fig. 3. As Fig. 2 but for the case of constant cluster size n = 12. 
Again, the solid lines are fits of the form mT/2a e = A + BL 1 with 
asymptotes A Ri 1. 
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Fig. 4. Cartoon of the hub-and-spoke migration model. The demes 
are divided into t clusters, within which migration occurs between 
every pairs of demes. Migration between demes lying in different 
clusters is allowed only if one of those demes is in the central hub; 
spoke-to-spoke migration is prohibited. 

more widely than others; in a biological context, one could 
perhaps interpret this model as reflecting a continental- 
archipelago formation, but with a structured population on 
the continent. Either way, we introduce three independent 
migration rates: ttiq — ttivq for migration between demes 
within the same cluster; m; ls = mvhs for migration from a 
spoke deme to a hub deme (going forwards in time) ; and the 
rate m s h — mi'sh f° r migration in the opposite direction. 

There are at least two ways in which one can make a 
meaningful comparison with the results of the previous sec- 
tion. First, one can impose a uniform overall rate of immi- 
gration m, with each deme receiving a fraction y of immi- 
grants coming from within the cluster, and the remaining 
fraction from outside the cluster. In such a case, the pa- 
rameters appearing in the migration rates are 



Vhs 



v = 



hs 



y 



(») 



n — 1 

n(i-l) 
1-1/ 



(60) 
(61) 
(62) 



Alternatively, one can impose a uniform emigration rate, 
with a fraction y of migrants from each deme remaining 
within the cluster, and the remainder going elsewhere. This 
rule is enforced by choosing v^ a = vf} = i// and v s h = 



sions below, they are valid for either rule once the appropri- 
ate expressions in terms of m, y, n and £ have been inserted. 
Note that the diagonal elements of the matrix of migration 
rates will be altered as a consequence of exchanging the 
off-diagonal elements so that probability is conserved; note 
also that for the two-level model, both rules are equivalent. 

One property of the hub-and-spoke model is that the sta- 
tionary state satisfies detailed bala nce, as one ca n show by 
using, e.g., a Kolmogorov criterion ( Kellv . [l979l ). This im- 
plies that ratios of stationary probabilities satisfy the rela- 
tion Q*/Q* = Vji/vij where i and j label demes. Introduc- 
ing Q* h for the total probability for the single remaining an- 
cestor of the entire population to reside somewhere in the 
hub, and its complement Q* = 1 — Q* h for the ancestor to 
be somewhere in the spoke, we have that (£ — 1)Q^/Q* = 
Vshjvhs and so for both rules 



Vsh 



Vsh + (£-l)Vhs 



(63) 



In particular, under uniform immigration one has = 
Q* s = \. However, since there are (for I > 2) more spoke 
denies than hub demes, a mutation occurring in the hub is 
[I — 1) times more likely to fix than one occurring in the 
spoke. Under the uniform emigration rule, the opposite is 
true, a mutation somewhere in the spoke being [I— 1) times 
more likely to fix. 

A further useful property of the hub-and-spoke model is 
that the diagonalisation of the matrix M that is needed to 
calculate — T\ via Eq. (45) can be done analytically. In 
fact, only one non-stationary eigenstate contributes. This 
state has 



A2 = — m [nv a h + n(£ - 
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(64) 
(65) 
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where the i th element of a column vector notated here as 
(h, s) T is h if i corresponds to a hub deme, and s otherwise, 
whilst the corresponding element of row vector (h, s) is h/n 
for a hub deme, and s/[n(£ — 1)] otherwise. Defined this 
way, the scalar product (h,s) ■ (h',s') T = hh' + ss' , and 
the stationary solution can be expressed as Ui = (Q h , Qt) 
which is orthogonal to V2, as required. With this notation 
established, it is easy to show that the row vector 



R* = (R* h ,R* s ) = m + [RlQt - R* s Q* h ) u 2 



(67) 



where Rh and R s are the total probability that the MRCA 
is somewhere in the hub or a spoke respectively. As a conse- 
quence of this relation, we must have R* • v„ = for n > 2 
by the biorthogonality of the eigenvectors. Therefore, only 
one term n = 2 in the sum in (45) contributes, as claimed. 

Evaluating (45) for the case of an initial mutation posi- 
tioned somewhere in the hub, one finds the mean time for 
subsequent fixation to be 
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Written this way, it is evident that when the MRCA is more 
likely to be in the hub than the single remaining ancestor 
in the stationary state, the mean time to fixation from the 
hub is less than that to the MRCA. 

Under the uniform immigration rule, the mean time to 
fixation from the hub is given by 



Th = T\ 



2Rt 



1 



2m(l-y) ' 

Meanwhile, when uniform emigration is enforced, 



(70) 



Th = Ti- 
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(71) 

The corresponding fixation times from a spoke deme can 
be found using the fact that Q* h Th + Qt T s = 2~i> as was 
mentioned at the end of Section 5.1 above. 

It remains to find T\ and R* h numerically; first, however, 
we obtain a prediction for the effective number of demes 
using Eq. (53). This is a more involved enterprise than for 
the two-level model, because there are four distinct ways to 
sample pairs of individuals from the population: both from 
the hub cluster (we denote this hh), both from the same 
spoke cluster (ss), one from the hub and one from a spoke 
(hs) and two from different spoke clusters (ss). The set of 
linear equations (51) then becomes 



Y hh = 
Y ss = 
Y hs = 

Y. M = 



1 + 2n(£ - \)v hs Y hs 
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1 + 2nv sh Y hs 
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Fig. 5. Mean time to fixation and the MRCA, normalised by the 
relative effective population size a e , in the hub-and-spoke model un- 
der the uniform immigration rule for y = 0.4, 0.6, 0.8 and increasing 
deme number L with fixed cluster number I = 12. The upward- 
and downward-pointing triangles show fixation times from the spoke 
T s and hub Th respectively, whilst the points in between are the 
MRCA times. The solid lines show fits to the latter of the form 
mTi/2a e = A + BL~ J ; in all three cases the asymptote Aral. 
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These can be readily solved using (for example) a computer 
algebra package. The formula for relative effective size (53) 
takes the form 

„ _ (Q:) 2 l(n-l)Y ss +n(£-2)Y s ,] + (e-l)Qll(n-l)QlY hh +2nQ* s 



e 2P[(£ - 1) + {£ - 2 fyX 

Simulation data for case of fixed £ = 12 are shown in Fig. 5 
for uniform immigration and in Fig. 6 for uniform emigra- 
tion. In both figures the combination inT /2a e is plotted for 
three different values of y (y — 0.4, 0.6, 0.8). In all cases fits 
to the data of the form A + BLr 1 reveal an asymptote A 
consistent with unity; that is, the data show an asymptotic 
linear growth with the slope predicted by the effective size 
formula (53). In both cases, the relative variation of fixa- 
tion time with the location of the first mutation vanishes in 
the limit L — > oc. That this should be the case can be seen 
from the formulas in Eq. (70) and (71); when £ is fixed, the 
correction term is bounded whilst T\ grows linearly L. 

A scaling of the fixation time that is nonlinear in L is seen 
when the number of demes is increased by adding more and 
more spokes of fixed size n. In this limit, the prediction for 
a e asymptotes to a constant 

[n-lf 5 1 



Y hs ] 



cv,. 



e {Ql) 2 [l+2{n-l)v Q Y ss +2nu sh Y hs \ + {l-l){Q» h Y[l+2(7i-l)v a Y hh +2n 

(76) 

We shall not present the full expression for a e here, as it is 
rather complicated and unrcvcaling. Instead, we shall show 
only how a e behaves asymptotically for large L, under the 
two immigration rules, and for the two contrasting limits 
of infinite system size discussed for the two-level model. 

In the case where the number of clusters £ is held fixed, 
and the number of demes within them n increased, the pre- 
dicted a e increases linearly with L under both the uniform 
immigration and emigration rule. Specifically one has for 
uniform immigration 



n-(l-y) 41-y 



(79) 
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under the uniform immigration rule, but increases quadrat- 
ically with L as 



1 



-L 



(80) 



2n 2 (l-y) 

as L — > oo. Plotting the quantity mT\j2a e for the case 
n = 12 suggests that this asymptotic quadratic growth is 
correctly predicted, shown by the approach to a constant 
value as L — > oo in Fig. 7. Again fits to the data suggest 
asymptotes consistent with unity, showing the accuracy of 
the prediction given by the effective relative population size 



Fig. 6. As Fig. 5, but for the uniform emigration rule. In this case, the 
mean fixation times from both hub and spoke are indistinguishable 
from the mean time to the MRCA, so only the latter has been 
plotted. The solid lines show fits of the form mT\/2a e = A + BL~ J , 
and show asymptotes A m 1 in all cases. 

(53). In Fig. 8, data for the uniform immigration rule with 
constant cluster size n = 12 are shown for the case y = 0.8 
(data for other y values show similar behaviour, but have 
been omitted for clarity). Even at the system sizes shown, 
the fixation time is still growing with L. However, the data 
are suggestive that eventually the fixation time will satu- 
rate to a constant as predicted by Eq. (53). First, a fit to 
the data of the form A + BL 1 suggests an asymptote of 
A w 1.38, that is, that the fixation time is bounded by a 
constant but one that is approximately 40% larger than 
that predicted by (53). (The asymptote for smaller values 
of y is also underestimated by (53), but not to such a great 
extent: the numerical data exceed the prediction by about 
10% for y = 0.4 and about 20% for y = 0.6). Further evi- 
dence for the fixation time remaining bounded in the limit 
L — > oo is provided by the fact that — uniquely among the 
models and limits considered — the relative differences be- 
tween the mean times to fixation and the MRCA remain 
finite in the limit L — > oo. As noted above, the absolute 
differences are bounded and so relative differences may re- 
main finite only if the fixation time saturates. 

6. Discussion and conclusion 

The aim of this work was to develop a better understand- 
ing of the effects of population subdivision on fixation un- 
der neutral genetic drift. This was achieved by exploiting 
a connection between forward- and backward-time proper- 
ties of neutral genetic drift which admitted the derivation 
of a number of new results for fixation properties in subdi- 
vided populations that are exact in the limit of cxtemely 
slow migration. Since the underlying motivation was partly 
to assess the viability of genetic drift as a mechanism for 
propagating a social change, we were particularly interested 
in establishing how fixation times grow with the number 
of demes. Furthermore, the existence of historical data re- 
quires knowledge of variation of fixation times with the ini- 



Fig. 7. As Fig. 5 but for the uniform emigration rule and in- 
creasing L as the cluster size is held fixed at n = 12. Here the 
points show numerical estimates of the mean time to the MRCA for 
y = 0.4, 0.6, 0.8; again the mean fixation times from hub and spoke 
are indistinguishable from these data. The solid lines are fits of the 
form mT/2a e = A + BL -1 , with A ss 1 in all three cases. 
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Fig. 8. As Fig. 5, but for the uniform immigration rule and increasing 
L as the cluster size is held fixed at n = 12. Here, the points show 
numerical data for the case y = 0.8; those for other y values are 
similar and have been omitted for clarity. The solid lines are fits of 
the form mT/2a e = A + BL~"< . 

tial distribution of mutants. 

We address the matter of variation first. It is clear from 
the elementary considerations leading to Eq. (7) that fix- 
ation probability can vary greatly with the initial location 
of mutants. For example, if one has in the hub-and-spoke 
model a mutation which is known to be located in the hub, 
it is (I — 1) times more likely to invade the whole popula- 
tion than the same mutation occurring in a spoke, as long 
as the uniform immigration rule is applied. Thus, if the 
number of spokes is very large, one can approach near cer- 
tainty for a mutation to fix by genetic drift alone. However, 
this is balanced by the fact that there are [I — 1) fewer hub 
demes than spoke demes, so a mutation occurring at a ran- 
dom location is as unlikely to fix in this subdivided popu- 
lation as any other with th e same overall size . We c ontrast 
again with the findings of iLieberman et al.l (|2005T ) which 
showed almost certain fixation from a random initial con- 
dition driven by selection in a spatially structured popu- 



lation. Such behaviour can also arise without the need for 
selection as long as one is able to position the initial mu- 
tations strategically, as would be the case when one is con- 
strained by historical data. 

Conversely, we observed only minor variation with the 
initial condition in the mean fixation time, averaged over 
those realisations of the dynamics where fixation actually 
occurs. This was seen both in new exact results for Wright's 
island model (Section 4) and in numerical data for clus- 
tered models (Section 5.2). In fact, in all cases the rela- 
tive magnitude of variation vanishes in the limit of an in- 
finite system, except possibly in the case where the fixa- 
tion time remained constant in that limit. However, even 
there, the variation was of the order of a few percent. This 
lack of variation has two practical benefits. First, one may 
as well approximate any historical data by a random ini- 
tial condition with the same overall frequency of mutants, 
and use Eq. (22) to calculate fixation times from the co- 
alescence times T n which are easy to obtain numerically. 
We remark that this formula can easily be generalised to 
find higher cumulants of the fixation time distribution. The 
second benefit is that if one notices considerable variation 
of fixation times in historical data, it may then be possible 
to rule out genetic drift as a propagation mechanism as a 
consequence. Finally, on the subject of variation, a curios- 
ity that emerged from the exact analysis of Wright's island 
model was evidence for mean time to fixation to be min- 
imised (subject to a constraint of a fixed overall mutant 
frequency) by a random initial condition. It would be in- 
teresting to show this more rigorously, and to see if this is 
also the case for a wider class of models. 

By considering the fate of a single mutation in models 
that had demes grouped into tightly-knit clusters, we estab- 
lished three different growth laws for the fixation time un- 
der various conditions: linear (as in Wright's island model), 
quadratic and approach to a constant. The latter scenario 
is, of course, the most intriguing, since then one has a 
population-level change occurring in an infinite population 
in a finite time through genetic drift alone. The origin of 
this phenomenon lies in the nonconservative nature of the 
migration process in place: the total number of individuals 
entering the spokes from the hub under the constant im- 
migration rule vastly exceeds the number entering the hub. 
This has the consequence that, as one goes backward in 
time, the probability of finding pair of lineages in a vanish- 
ingly small region of space (the hub) remains finite as the 
number of demes is increased; thus their coalescence, and 
hence fixation, can occur in a finite time. By contrast, under 
the uniform emigration rule, the probability that lineages 
are found in separate spokes is finite, and one must wait a 
long time until they are both present in the same deme: this 
drives the superlinear increase in fixation time. Although 
in a biological context the hub-and-spoke model may be of 
limited application, one can think of this enhancement of 
offspring number as a form of spatially-dependent selection 
and it would perhaps be of interest to see if a similar effect 
is evident in potentially more realistic situations where the 



deme sizes are small and their number restricted by topo- 
logical considerations. 

On the other hand, in the cultural context of language 
change where migration corresponds to a speaker retaining 
a record of another's utterances, there is no particular rea- 
son to assume conservative migration. In fact, a uniform 
immigration rule as implemented in this work arises rather 
naturally, as it corresponds to every speaker dividing the 
same amount of attention equally between each person she 
listens to. Furthermore, the hub-and-spoke model probably 
better reflects the nature of social interactions, in which 
some members of society have more long-range connections 
than others. It is unclear, however, if the phenomenon of 
a finite fixation time will be seen for infinite populations 
on more realistic social networks. We are currently investi- 
gating this possibility, and results will be reported in due 
course. 

Finally, we compared results from simulations with pre- 
dictions from a formula for effecti ve popula t ion si ze, Eq. (5) 
whose general form was given by lRousset ( 20041 ) and that 
was specialised here to the extreme slow migration limit in 
which all migration rates are inversely proportional to the 
deme size and have a vanishingly small coefficient. The re- 
sulting formula, (53), was found to give precise predictions 
for the fixation time in the limit of an infinite number of 
demes for all models considered, except when the fixation 
time appears to be bounded by a constant; here, the pre- 
dicted asymptote was seen to be exceeded by as much as 
40%. This would suggest that in this particular case, the fix- 
ation time is not well characterised by the asymptotic coa- 
lescence rate that appears on the right-hand side of Eq. (5). 
To explore this possibility, it is worth considering the ex- 
treme case of two demes with arbitrary migration rates 
m\2 = mi>i2, mix — fni>i\ in the extreme slow migration 
limit to — > 0. If one starts with one lineage per deme in this 
model, the rate of coalescence between these lineages is c = 
TO12 + TO21, since that is the rate at which one of the lineages 
hops from one deme to another, at which time a coalescence 
immediately takes place (as least in the limit to — > 0). By 
contrast, the mean asymptotic rate of coalesence given by 
the asmyptotic effective size formula (53) is c/[2Q\(l — Q\)] 
where Q\ = ra-ixj {m\i +TI21). For any choice of migration 
rates, the true rate of coalescence between the last pair of 
lineages is overestimated by a least a factor of 2, and hence 
the effective population size underestimated as was seen for 
the hub-and-spoke model. This two-demes model provides 
an extreme example of a case where all coalesences occur 
before the onset of the asymptotic regime. We believe this 
is also what is happening in the hub-and-spoke model, and 
could further explain why this is the only case in this work 
in which any variation of fixation time with initial condi- 
tion was observed. Through mo re careful consi derations of 
the relevant coalescence events (|Rousset . 20041 ). one antic- 
ipates that better predictions for the fixation time can be 
obtained. 

We end by remarking that if one is only interested in 
the general scaling of the fixation time with the number of 



demes (and not i n precise estimat es of the coefficients), the 
simple formula of lNaevlakil (|l98dh . iV" 1 = T,i(Q*) 2 / N h is 
much easier to apply than the full expression (5) and can be 
used as long as lineages are well mixed by the dynamics and 
fluctuation effects are unimportant. An example of a model 
in which this simple formula gives the wrong prediction for 
scali ng is the stepping-stone model in two dimensions an d 
less (|Slatkinl . [l99l1lRoussetl[f997tlCox and Durrettl . f2002t ) . 



It is also perhaps worth noting that the stepping-stone 
model in the slow-migration limit m — > is equivalent to 
the particle reactio n system A + A — » A that has been of in- 
terest to physicists (|Pelitil . Il98§ Iben Avrahaml . Il998h . The 
methods that have been employed in this context allow, in 
principle, the distribution of ancestors Q(C\D; t) appearing 
in the general relation (10) to be calculated for the step- 
ping stone model in one dimension. It would be interesting 
to exploit this connection to obtain further new results for 
the properties of fixation in subdivided populations. 
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